Generalized Collective Inference with Symmetric Clique Potentials

Authors

  • Rahul Gupta
  • Sunita Sarawagi
  • Ajit A. Diwan
Abstract

Many tasks, such as image segmentation, web page classification, and information extraction, can be cast as joint inference tasks in collective graphical models. Such models exploit any inter-instance associative dependence to output more accurate labelings. However, existing collective models support only very limited kinds of associativity, such as associative labeling of different occurrences of the same word in a text corpus. This restricts the accuracy gains from using such models. In this work we make two major contributions.

First, we propose a more general collective inference framework that encourages various data instances to agree on a set of properties of their labelings. Agreement is encouraged through symmetric clique potential functions. We show that known collective models are specific instantiations of our framework with certain very simple properties. We demonstrate that using non-trivial properties can lead to bigger gains, and present a systematic inference procedure in our framework for a large class of such properties. In our inference procedure, we perform message passing on the cluster graph, where property-aware messages are computed with cluster-specific algorithms. Ordinary property-oblivious message passing schemes are intractable in such setups. We show that property conformance, as encouraged in our framework, provides an inference-only solution for domain adaptation. Our experiments on bibliographic information extraction illustrate significant test error reduction over unseen domains.

Our second major contribution is a suite of algorithms to compute messages from clique clusters to other clusters for a variety of symmetric clique potentials (the clique inference problem). Our algorithms are exact for arbitrary cardinality-based clique potentials on binary labels and for max-like and majority-like clique potentials on multiple labels. For majority-like potentials, we also provide an efficient Lagrangian Relaxation based algorithm that compares favorably with the exact algorithm. Moving towards more complex potentials, we show that clique inference becomes NP-hard for cliques with homogeneous Potts potentials. We present a 13/15-approximation algorithm with runtime sub-quadratic in the clique size. In contrast, the best known previous guarantee for graphs with Potts potentials is only 1/2. We perform empirical comparisons on real and synthetic data, and show that our proposed methods for Potts potentials are an order of magnitude faster than the well-known Tree-based re-parameterization (TRW) and graph-cut algorithms. We demonstrate that our Lagrangian Relaxation based algorithm for majority potentials beats the best applicable heuristic, ICM, in a variety of scenarios.
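To make the clique inference problem concrete, the following is a minimal sketch for the binary-label, cardinality-based case mentioned in the abstract. It is not taken from the paper: the function names `clique_map_binary` and `brute_force`, the `node_scores` layout, and the example potential are all assumptions made for illustration. The sketch relies on a standard observation: when the clique potential depends only on how many nodes take label 1, the best labeling with exactly k ones assigns label 1 to the k nodes with the largest gain from switching 0 to 1, so one sort plus a linear scan over k is enough.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def clique_map_binary(
    node_scores: Sequence[Tuple[float, float]],
    clique_potential: Callable[[int], float],
) -> Tuple[float, List[int]]:
    """Maximize sum_i node_scores[i][y_i] + clique_potential(number of ones).

    node_scores[i] = (score of label 0, score of label 1) for node i.
    clique_potential(k) is any function of the count k of nodes labeled 1.
    Both names are hypothetical, chosen for this sketch.
    """
    n = len(node_scores)
    # Gain obtained by switching node i from label 0 to label 1.
    gain = [s1 - s0 for s0, s1 in node_scores]
    order = sorted(range(n), key=lambda i: gain[i], reverse=True)

    # Start from the all-zeros labeling (k = 0).
    running = sum(s0 for s0, _ in node_scores)
    best_score, best_k = running + clique_potential(0), 0

    # For each k, the best labeling with k ones uses the k largest gains.
    for k, i in enumerate(order, start=1):
        running += gain[i]
        score = running + clique_potential(k)
        if score > best_score:
            best_score, best_k = score, k

    labels = [0] * n
    for i in order[:best_k]:
        labels[i] = 1
    return best_score, labels


def brute_force(
    node_scores: Sequence[Tuple[float, float]],
    clique_potential: Callable[[int], float],
) -> float:
    """Exhaustive reference over all 2^n labelings, for small n only."""
    n = len(node_scores)
    return max(
        sum(node_scores[i][y] for i, y in enumerate(ys)) + clique_potential(sum(ys))
        for ys in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    scores = [(1.0, 0.5), (0.2, 0.9), (0.4, 0.3), (0.0, 2.0)]
    # A majority-like potential (assumed form): reward the larger label group.
    majority = lambda k: 1.5 * max(k, len(scores) - k)
    print(clique_map_binary(scores, majority))
```

The sort dominates the cost, so the sketch runs in O(n log n) time for a clique of n nodes, compared with the 2^n labelings enumerated by `brute_force`. The multi-label and Potts cases described in the abstract need different algorithms and are not covered here.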


Related resources

Collective Inference for Extraction MRFs Coupled with Symmetric Clique Potentials

Many structured information extraction tasks employ collective graphical models that capture inter-instance associativity by coupling them with various clique potentials. We propose tractable families of such potentials that are invariant under permutations of their arguments, and call them symmetric clique potentials. We present three families of symmetric potentials: MAX, SUM, and MAJORITY. We ...


A General Algorithm for Approximate Inference and Its Application to Hybrid Bayes Nets

The clique tree algorithm is the standard method for doing inference in Bayesian networks. It works by manipulating clique potentials — distributions over the variables in a clique. While this approach works well for many networks, it is limited by the need to maintain an exact representation of the clique potentials. This paper presents a new unified approach that combines approximate inferenc...


Bayesian Inference for Spatial Beta Generalized Linear Mixed Models

In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on the (0, 1) interval that covers symm...


Lifted Generalized Dual Decomposition

Many real-world problems, such as Markov Logic Networks (MLNs) with evidence, can be represented as a highly symmetric graphical model perturbed by additional potentials. In these models, variational inference approaches that exploit exact model symmetries are often forced to ground the entire problem, while methods that exploit approximate symmetries (such as by constructing an over-symmetric ...


Exploiting Within-Clique Factorizations in Junction-Tree Algorithms

We show that the expected computational complexity of the Junction-Tree Algorithm for MAP inference in graphical models can be improved. Our results apply whenever the potentials over maximal cliques of the triangulated graph are factored over subcliques. This enlarges the class of models for which exact inference is efficient. Graphs whose potentials factorize The graphical models shown above c...



Journal:
  • CoRR

Volume abs/0907.0589

Publication date 2009